-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data Interpreter Multi-Agent Pipeline #469
base: main
Are you sure you want to change the base?
Data Interpreter Multi-Agent Pipeline #469
Conversation
update update update updated Update di_multiagent.py update update Create README.md Update README.md undo unnecessary changes undo unnecessary changes Update .pre-commit-config.yaml Update .pre-commit-config.yaml Update common.py remove changes in other files Update README.md update updated README and directory name remove Update di_multiagent.py
b01b066
to
83edd23
Compare
Slight modification to the format_instruction to improve the success rate of `RegexTaggedContentParser` parsing the model API output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done! My only concern is the modification within the react agent. Please see inline comments for details.
src/agentscope/agents/react_agent.py
Outdated
@@ -109,7 +109,7 @@ def __init__( | |||
|
|||
# Initialize a parser object to formulate the response from the model | |||
self.parser = RegexTaggedContentParser( | |||
format_instruction="""Respond with specific tags as outlined below: | |||
format_instruction="""Respond with specific tags as outlined below in json format: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to add the requirement "in json format" here?
For string type, the json format is
<thought>'this is a test'</thought>
while the non-json format is
<thought>this is a test</thought>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my testing, without adding 'in json format', RegexTaggedContentParser
cannot successfully parse boolean variables or None
(will treat them as strings) and hence cannot be recognized correctly by tools like 'execute_python_code'. E.g., if the RegexTaggedContentParser
returns a string None
, if use_docker is None
will be false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: I have modified my PR to override RegexTaggedContentParser
's format_instruction
within my example. I would caution against using RegexTaggedContentParser
when try_parse_json=True
without explicit instruction of adhering the arguments of interest to JSON format in format_instruction
, since LLMs will likely output contents in other formats and cause json.loads
to fail.
…reinforce the output adhere to json format within this example
…o reinforce the output adhere to json format within this example.
|
||
This step enables the executed code by the agents to perform required operations that are otherwise restricted by default. Ensure you understand the security implications of modifying these restrictions. | ||
|
||
- Comment out the following in `src/agentscope/utils/common.py`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we have specific working directory here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original create_tempdir
creates a temporary directory and changes the current working directory to it; however, it does not persist across different runs of code execution. My solution is to use the current directory for executing code and preserving the execution results.
from agentscope.service.service_status import ServiceExecStatus | ||
|
||
|
||
def read_csv_file(file_path: str) -> ServiceResponse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Group this read_csv_file
with the one in di_multiagent.py
into a util.py file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up not using read_csv_file
and write_csv_file
(the agent is able to use code to achieve the same result), and I observed that giving the agent too many tools can have a detrimental effect on choosing the correct tool to use, so I have removed read_csv_file
and write_csv_file
from both scripts.
"dependent_task_ids": list[str] = "ids of tasks prerequisite to this task", | ||
"instruction": "what you should do in this task, one short phrase or sentence", | ||
"task_type": "type of this task, should be one of Available Task Types", | ||
"task_type": "type of this task, should be one of Available Task Types", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Duplicate lines?
- what are the
Available Task Types
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Duplicates removed.
- Since this framework is designed to be generic, there are actually no pre-defined Available Task Types.
task_type
is optional since its only purpose at the moment is to give the solver a high-level idea of what kind of tool to employ. I have corrected the relevant prompts.
"dependent_task_ids": list[str] = "ids of tasks prerequisite to this task", | ||
"instruction": "what you should do in this task, one short phrase or sentence", | ||
"task_type": "type of this task, should be one of Available Task Types", | ||
"task_type": "type of this task, should be one of Available Task Types", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions as in the comment above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as the above
"dependent_task_ids": list[str] = "ids of tasks prerequisite to this task", | ||
"instruction": "what you should do in this task, one short phrase or sentence", | ||
"task_type": "type of this task, should be one of Available Task Types", | ||
"task_type": "type of this task, should be one of Available Task Types", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions as the comments above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as the above
) | ||
|
||
|
||
_ = load_dotenv(find_dotenv()) # read local .env file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why here need to have load_dotenv?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line locates the .env file and load_dotenv() reads the .env file and updates os.environ. Without it the API keys won't be loaded properly.
name: Pull Request
about: Create a pull request
Description
An example to demonstrate the problem-solving capability of multiagent LLMs with adaptive planning on various tasks.
Checklist
Please check the following items before code is ready to be reviewed.